Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima
Authors
Abstract
We propose stochastic optimization algorithms that can find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm only needs Õ(ε^{-10/3}) stochastic gradient evaluations to converge to an approximate local minimum x, which satisfies ‖∇f(x)‖₂ ≤ ε and λ_min(∇²f(x)) ≥ −√ε in the general stochastic optimization setting, where Õ(·) hides polylogarithmic terms and constants. This improves upon the Õ(ε^{-7/2}) gradient complexity achieved by the state-of-the-art stochastic local-minimum-finding algorithms by a factor of Õ(ε^{-1/6}). For nonconvex finite-sum optimization, our algorithm also outperforms the best known algorithms in a certain regime.
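The guarantee above is a second-order stationarity condition. As a rough numerical illustration (not taken from the paper; the toy objective, thresholds, and function names below are made up), one can check whether a given point satisfies ‖∇f(x)‖₂ ≤ ε and λ_min(∇²f(x)) ≥ −√ε as follows:

```python
import numpy as np

# Toy check of the (eps, sqrt(eps))-approximate local minimum condition:
#   ||grad f(x)||_2 <= eps   and   lambda_min(Hessian f(x)) >= -sqrt(eps).
# The objective f(x) = 0.5*x0^2 - 0.25*x1^4 is illustrative only.

def grad_f(x):
    # gradient of the toy objective
    return np.array([x[0], -x[1] ** 3])

def hess_f(x):
    # Hessian of the toy objective
    return np.array([[1.0, 0.0], [0.0, -3.0 * x[1] ** 2]])

def is_approx_local_min(x, eps):
    grad_small = np.linalg.norm(grad_f(x)) <= eps
    no_strong_negative_curvature = np.linalg.eigvalsh(hess_f(x)).min() >= -np.sqrt(eps)
    return grad_small and no_strong_negative_curvature

print(is_approx_local_min(np.array([0.0, 0.0]), eps=1e-2))   # True: gradient zero, curvature not strongly negative
print(is_approx_local_min(np.array([0.0, 1.0]), eps=1e-2))   # False: large gradient and strong negative curvature
```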
Similar Resources
Local Smoothness in Variance Reduced Optimization
We propose a family of non-uniform sampling strategies to provably speed up a class of stochastic optimization algorithms with linear convergence including Stochastic Variance Reduced Gradient (SVRG) and Stochastic Dual Coordinate Ascent (SDCA). For a large family of penalized empirical risk minimization problems, our methods exploit data dependent local smoothness of the loss functions near th...
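As a loose illustration of non-uniform sampling inside SVRG (a sketch only: it uses fixed per-example smoothness constants L_i as sampling weights, not the data-dependent local smoothness estimates that paper proposes; all names and constants are illustrative):

```python
import numpy as np

# Minimal SVRG sketch with non-uniform (importance) sampling.
def svrg_nonuniform(grad_i, x0, n, L, step=0.01, epochs=10, m=None):
    """grad_i(i, x) returns the gradient of the i-th loss at x; L holds per-example smoothness constants."""
    m = m or 2 * n
    p = L / L.sum()                      # sampling distribution proportional to smoothness
    x = x0.copy()
    for _ in range(epochs):
        x_snap = x.copy()
        full_grad = np.mean([grad_i(i, x_snap) for i in range(n)], axis=0)
        for _ in range(m):
            i = np.random.choice(n, p=p)
            # importance-weighted variance-reduced gradient estimator (unbiased for grad f(x))
            g = (grad_i(i, x) - grad_i(i, x_snap)) / (n * p[i]) + full_grad
            x -= step * g
    return x

# Example usage on a tiny least-squares problem (illustrative):
A = np.random.randn(50, 5)
b = np.random.randn(50)
grad_i = lambda i, x: A[i] * (A[i] @ x - b[i])
L = np.array([A[i] @ A[i] for i in range(50)])   # per-example smoothness constants
x_hat = svrg_nonuniform(grad_i, np.zeros(5), n=50, L=L)
```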
Neon2: Finding Local Minima via First-Order Oracles
We propose a reduction for non-convex optimization that can (1) turn a stationary-point finding algorithm into a local-minimum finding one, and (2) replace the Hessian-vector product computations with only gradient computations. It works in both the stochastic and the deterministic settings, without hurting the algorithm's performance. As applications, our reduction turns Natasha2 into a firs...
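The second part of such a reduction rests on the standard observation that a Hessian-vector product can be approximated by a finite difference of two gradients. The sketch below shows that identity in isolation (it is not Neon2's actual procedure, and the quadratic example is illustrative):

```python
import numpy as np

# Approximate a Hessian-vector product using only gradient evaluations:
#   H(x) v  ~  (grad f(x + delta * v) - grad f(x)) / delta.
def hvp_via_gradients(grad_f, x, v, delta=1e-5):
    return (grad_f(x + delta * v) - grad_f(x)) / delta

# Example on f(x) = 0.5 * x^T A x, whose Hessian is exactly A:
A = np.array([[2.0, 0.5], [0.5, 1.0]])
grad_f = lambda x: A @ x
x, v = np.array([1.0, -1.0]), np.array([0.3, 0.7])
print(hvp_via_gradients(grad_f, x, v))   # close to A @ v
print(A @ v)
```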
Fastest Rates for Stochastic Mirror Descent Methods
Relative smoothness, a notion introduced in [6] and recently rediscovered in [3, 18], generalizes the standard notion of smoothness typically used in the analysis of gradient-type methods. In this work we take ideas from the well-studied field of stochastic convex optimization and use them to obtain faster algorithms for minimizing relatively smooth functions. We propose and analyze ...
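For context, the classical mirror-descent step minimizes a linearization of f plus a Bregman divergence D_h(y, x_k) scaled by the (relative) smoothness constant. With the negative-entropy mirror map on the probability simplex this becomes the exponentiated-gradient update sketched below (a generic illustration, not the algorithm analyzed in that work):

```python
import numpy as np

# Mirror descent on the probability simplex with the negative-entropy mirror map:
#   x_{k+1} = argmin_y  <grad f(x_k), y> + (1/step) * D_h(y, x_k),
# which reduces to an exponentiated-gradient update followed by renormalization.
def mirror_descent_simplex(grad_f, x0, step=0.1, iters=100):
    x = x0.copy()
    for _ in range(iters):
        x = x * np.exp(-step * grad_f(x))
        x /= x.sum()                      # project back onto the simplex by renormalizing
    return x

# Example: minimize <c, x> over the simplex; the mass concentrates on argmin(c).
c = np.array([3.0, 1.0, 2.0])
print(mirror_descent_simplex(lambda x: c, np.ones(3) / 3))
```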
Efficient approaches for escaping higher order saddle points in non-convex optimization
Local search heuristics for non-convex optimizations are popular in applied machine learning. However, in general it is hard to guarantee that such algorithms even converge to a local minimum, due to the existence of complicated saddle point structures in high dimensions. Many functions have degenerate saddle points such that the first and second order derivatives cannot distinguish them with l...
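A standard example of such a degenerate saddle (illustrative, not drawn from that paper) is the monkey saddle f(x, y) = x³ − 3xy², where both the gradient and the Hessian vanish at the origin even though the origin is not a local minimum:

```python
import numpy as np

# Monkey saddle: first- and second-order information alone cannot rule out a local
# minimum at the origin, yet f takes negative values arbitrarily close to (0, 0).
f = lambda p: p[0] ** 3 - 3 * p[0] * p[1] ** 2
grad = lambda p: np.array([3 * p[0] ** 2 - 3 * p[1] ** 2, -6 * p[0] * p[1]])
hess = lambda p: np.array([[6 * p[0], -6 * p[1]], [-6 * p[1], -6 * p[0]]])

print(grad(np.zeros(2)), hess(np.zeros(2)))   # both identically zero at the origin
print(f(np.array([-1e-3, 0.0])))              # negative value nearby, so not a local minimum
```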
Low-thrust Orbit Transfer Optimization Using Genetic Search
Most techniques for solving dynamic optimization problems involve a series of gradient computations and one-dimensional searches at some point in the optimization process. A large class of problems, however, does not possess the necessary smoothness properties that such algorithms require for good convergence. Even when smoothness conditions are met, poor initial guesses at the solution often r...
Journal: CoRR
Volume: abs/1712.06585
Publication year: 2017